EDA News
Monday March 29, 2004
From: EDACafe

Hardware/Software Co-verification
March 22-26, 2004
By Dr. Jack Horgan

Read business, product, and alliance news and analysis of weekly happenings.

In our personal and professional lives we are all familiar with application software such as Internet browsers and office productivity tools. These programs, along with EDA software, are developed in high-level languages (C/C++, Java, ...) and targeted to run on COTS (commercial off-the-shelf) PCs and workstations under popular operating systems (Windows, Unix, Linux). The developers of these packages have access to powerful Integrated Development Environments (IDEs) for generating, managing and testing code. Programmers have the luxury of running their applications in a stable environment, in real time, under debug mode. They are largely isolated from the details of the hardware design. This is not the case for developers of embedded software. In particular, they do not have access to the target hardware until there is at least a physical prototype. This means that a significant portion of the software development effort is delayed, making the overall process more sequential and lengthier than desirable. In an era of shrinking product lifetimes and competitive time-to-market pressures, this is a serious issue. Also, problems, or even improvements, that could and should have been resolved in hardware remain undetected until the software debug phase. At that stage the time and cost to fix them via a hardware change are excessive. At 130 nm, a mask set for a complex SoC exceeds $500,000; at 90 nm, pricing approaches $1,000,000. Implementing a fix in software rather than hardware could translate into reduced performance and/or dropped functionality. Lastly, having firmware tested against ideal hardware (no manufacturing defects) prior to first silicon can be very helpful to hardware engineers in their early testing endeavors.
These issues are becoming more serious as the amount of software content in embedded systems grows. As a measure of this growth, consider a 2002 study by VDC, a technology market research and strategy firm. VDC estimated the number of embedded software developers at 236,800 and the number of embedded hardware developers at 130,900, with the software population growing at 8.3% annually, a faster rate than the 4% growth among hardware developers. Industry experts see software costs equaling hardware costs at 130 nanometers and exceeding them at 90 nanometers. Hardware verification is itself becoming more challenging. Verification times have increased with rising gate counts and overall design complexity. According to a 2002 survey by Collett International Research, only 39% of designs were bug free at first silicon, while 60% contained logic or functional flaws; more than 20% required three or more silicon spins. A Collett survey also showed that nearly 50% of total engineering time was spent in verification. In the traditional development process the hardware and software portions proceed independently and in parallel, with little communication between the two teams. An ideal solution would enable developers to do software verification against an accurate model of the silicon before first silicon is available, and with sufficient performance to run the complex software expected in an advanced device. This is referred to as co-verification. Since software is coded rather than synthesized, some may prefer the term co-simulation. This approach should shorten product development time through greater parallelism, reduce risk through earlier testing, and improve the design through greater communication between the software and hardware design teams.
One approach to hardware verification is to use high-end hardware emulation systems such as Mentor Graphics' VStation (which uses a massive array of FPGAs) and Cadence's Palladium (based upon a custom ASIC design) to drive simulation acceleration and in-circuit emulation. These systems deliver high-capacity, high-performance, real-system verification. They can scale above 100 million gates with performance up to 1 MHz in a simulation-like debug environment that allows 100% signal visibility into the design. The major drawback of these systems is their cost, making them inappropriate or inaccessible for some design projects or smaller firms. While old-fashioned breadboarding is not possible with today's SoCs and ASICs, one can develop custom hardware emulators based on FPGAs. Using FPGA synthesis and partitioning tools such as Certify from Synplicity, an entire circuit can be mapped into one or more FPGAs, and the software development environment connected to the board via a standard JTAG interface. Unfortunately, the lack of internal visibility precludes debugging of the design hardware. This approach to rapid prototyping is of course its own hardware design project, which consumes time, money and talent. Synplicity claims that a complete hardware prototype in the form of an FPGA-based board can be generated in under a month for less than $100K, including the tools. Time and cost would increase if multiple FPGA boards were required. There are a few small firms that offer a general-purpose FPGA-based prototyping environment at a more reasonable price point than the high-end hardware emulators described earlier. Charles Miller, Aptix SVP of Marketing and Business Development, compared the roll-your-own approach to walking a tightrope without a net. What happens if the FPGA-based prototype doesn't come up when power is applied? Lauro Rizzatti, EVE-USA CEO, pointed out possible problems with clock distribution, memory mapping and limited FPGA I/O pins.
The risk of the roll-your-own approach increases with the number of required FPGA boards. Also, any change in the design can force a rework of the custom prototype. One vendor of FPGA-based prototyping environments is Emulation and Verification Engineering (EVE), founded in France in 2000 by former executives of Meta Systems. EVE's product line is branded ZeBu, short for Zero Bugs, and consists of a family of hardware-assisted verification PCI platforms based on Xilinx Virtex-II FPGAs. The design under test is mapped onto one or several ZeBu boards via a Reconfigurable Test Bench (RTB) made up of additional Virtex-II FPGAs, SDRAM and SRAM chips, and proprietary firmware. The mapping is carried out through any of the most popular commercial ASIC/FPGA RTL synthesis tools plus the software compilation package included with the ZeBu system, which deals with gate-level partitioning, clocks, and memory models. For hardware/software co-verification, the processing units can be connected to a software debugger running on the same PC, or on a separate PC or workstation, through a JTAG interface. ZeBu can execute software drivers, operating systems or applications at MHz speeds while providing hardware debugging capabilities. By connecting as many as 8 ZeBu boards, the system can reach a maximum verification capacity of 12 million ASIC gates. Another vendor, Aptix, is a fifty-person company based in Sunnyvale and founded in 1989. The company provides flexible hardware platforms for building reconfigurable pre-silicon prototypes (PSPs) based on proprietary Field Programmable Interconnect (FPIC) technology. A monolithic SoC design can be synthesized and partitioned across multiple FPGA devices and integrated with CPUs, DSPs, memories and other IP blocks to complete the circuit. Aptix provides three modes of operation supporting a variety of performance levels. In co-emulation mode, the user runs an RTL testbench against the hardware prototype running in the Aptix platform at tens of KHz.
In vector mode, test vector sets are streamed into the hardware at speeds of hundreds of KHz. Finally, in transaction mode, system-level models are run with the hardware to produce speeds measured in MHz. Aptix also offers the Software Integration Station, a low-cost replica of its Field Programmable Circuit Board without the hardware probing and debugging features, for distribution to software developers. Aptix supports and promotes a block-based approach to embedded SoC design. IP blocks are prototyped and validated independently using reusable testbenches and co-emulation. Once validated, the IP blocks are easily assembled to produce a full system prototype. A third vendor is Axis Systems, which Verisity acquired in February 2004 for $80 million in cash and stock. Axis had revenues greater than $20 million in its last reported fiscal year. Its emulation and simulation products, based upon ReConfigurable Computing (RCC) technology, include Xcite for up to 10 million ASIC gates at speeds of 100K cycles/sec, Xtreme for up to 100 million gates at speeds up to 1 MHz, and Xtreme-II for a platform-based design flow. One problem that FPGA-based prototypes face is incompatibility between ASIC and FPGA synthesis solutions. Usually, the RTL code, synthesis constraints, scripts and ASIC IP must be changed to move designs between the ASIC and the prototype. This is a time-consuming and error-prone manual effort. On March 15th Synopsys announced Design Compiler FPGA (DC FPGA), a new FPGA synthesis tool targeted at this problem. Built upon Design Compiler technology, DC FPGA enables the integration of the ASIC and FPGA design environments. DC FPGA accepts the same RTL code, constraints, scripts and IP libraries as Design Compiler and provides the same interface to Formality formal verification.
DC FPGA also offers Adaptive Optimization technology, which automatically activates the best core synthesis algorithms based upon multiple parameters, then dynamically controls and reorders how the algorithms are applied. In the press announcement SVP Antun Domic said that "Over forty percent of our customers are prototyping their ASICs in FPGAs". In the absence of hardware emulation or assistance, one can use a software logic simulator such as Synopsys VCS, Mentor ModelSim or Cadence Incisive. In all these approaches the design under test is described using a hardware description language (HDL) such as Verilog or VHDL. These event-driven models are compiled and then executed using a testbench that defines the stimuli or test vectors. They model system activity as a series of events that occur asynchronously and at irregular intervals. An improvement on this approach is cycle-based simulation, which ignores timing and evaluates the state of the logic once during each simulation cycle. By ignoring intra-cycle state transitions, cycle-based simulation can achieve speeds 10 to 50 times faster than event-driven simulation. Initially, cycle-based algorithms imposed design restrictions, such as requiring a single synchronous clock, that limited the applicability of cycle-based simulation technology for many SoC designs. Testbenches were originally written in languages such as Verilog or VHDL. Today there are specialized languages for testbenches, such as OpenVera from Synopsys and e from Verisity. Testbenches include stimulus injectors, results capture, results predictors, pattern generators and other elements to tie together or drive the components. Tools for testbench automation are commercially available. What about the software? The software can be compiled and executed on the host computer where the simulation is being run. This model of the software will not be cycle accurate because the instruction set being run is that of an entirely different system.
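The speed advantage of cycle-based simulation comes from evaluating the logic exactly once per clock, ignoring intra-cycle activity. A minimal sketch of the idea, with a purely illustrative 2-bit counter standing in for the design under test:

```python
# Minimal cycle-based simulation sketch (illustrative only, not a real
# simulator). State is evaluated exactly once per clock cycle; intra-cycle
# glitches and timing are ignored, which is what gives cycle-based engines
# their speed advantage over event-driven simulation.

def next_state(state):
    """Combinational next-state logic: increment a 2-bit counter."""
    return (state + 1) % 4

def simulate_cycles(num_cycles, state=0):
    """Advance the design one full clock at a time, recording each state."""
    trace = [state]
    for _ in range(num_cycles):
        state = next_state(state)   # one evaluation per cycle
        trace.append(state)
    return trace

print(simulate_cycles(5))  # [0, 1, 2, 3, 0, 1]
```

An event-driven simulator would instead queue and service every signal transition within each cycle, which is far more work per unit of simulated time.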
Another approach is to use an Instruction Set Simulator (ISS) that emulates the behavior of the CPU. The ISS typically includes all the target processor's registers and system state, and as such is a complete model of the target processor. The ISS performs the fetch-decode-execute cycle for the target CPU's instruction set on the host processor. The mechanism that links the software and hardware simulations is a Bus Functional Model (BFM) or Bus Interface Model (BIM). The processor model is connected to the design at the pin level, and a configuration file within the pin interface model handles timing in and out of the model. The pin interface is driven by the BFM, which in turn is driven by the software execution environment. There is a mismatch between simulation and real life in that an event-driven simulator spends most of its time servicing events related to CPU-memory interaction (instruction fetches, reads and writes to memory, and I/O operations on memory-mapped peripherals), while a real CPU spends most of its time on internal operations. Since memory interactions are repetitive, little is gained by re-simulating them. Simulation performance could be improved if these memory interactions were serviced by a higher-performance mechanism, such as processing them against a memory array. Another technique is clock suppression during periods where the software is executing in a manner that does not advance the state of the hardware. Mentor offers Seamless co-verification to enable HW/SW integration and verification to take place on a virtual prototype of the system. Seamless maintains a unique storage array, the Coherent Memory Server, for the memory space that is accessed frequently by the processor. This lets the user choose whether access to a given memory range is serviced by the logic simulator or intercepted by Seamless.
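The fetch-decode-execute loop an ISS performs can be sketched as follows. The 3-instruction toy CPU here is invented for illustration and does not correspond to any real target processor; a production ISS models every architectural register and flag of a real instruction set the same way:

```python
# Hypothetical instruction-set-simulator sketch: a fetch-decode-execute
# loop for a toy 3-instruction CPU, run entirely on the host processor.

def run_iss(program, max_steps):
    # Architectural state: two general registers and a program counter.
    regs = {"r0": 0, "r1": 0, "pc": 0}
    for _ in range(max_steps):
        op, *args = program[regs["pc"]]      # fetch and decode
        regs["pc"] += 1
        if op == "movi":                     # execute: load immediate
            regs[args[0]] = args[1]
        elif op == "add":                    # execute: register add
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "halt":
            break
    return regs

prog = [("movi", "r0", 2),
        ("movi", "r1", 3),
        ("add", "r0", "r0", "r1"),
        ("halt",)]
print(run_iss(prog, 10))  # r0 ends up as 5
```

In a co-verification setup, the memory reads and writes this loop performs would be routed through the BFM to the simulated hardware rather than a host dictionary.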
The user can switch memory access modes at any time during simulation to achieve either detailed simulation of memory cycles or rapid execution of software, depending on the situation. Seamless makes use of memory models available from Denali. Seamless has processor support packages for over 100 microprocessors and DSPs; each includes an ISS, a BIM and a graphical software debugger. Seamless can display charts of software profiling, memory transactions, bus transactions, and bus-arbitration delay. At a recent seminar Mentor cited an example in which booting LynxOS for a device under Seamless took 9 minutes, compared to 233 hours for unassisted RTL simulation. A technique that can improve the performance of all of the approaches discussed thus far is transaction-based modeling. Traditional verification environments have been event based, meaning they have to provide data every clock cycle or even every sub-cycle. Transaction-based verification is a technique in which large amounts of data, representing single or multiple clock cycles, can be passed into the simulation without multiple calls. Transactions represent a higher level of abstraction and address architecturally visible data types. For example, an Ethernet transaction would deal with an entire Ethernet packet, while a PCI DMA bus transaction would deal with an entire burst transfer. Current verification methods use bus functional models (BFMs) or bus transactors to drive signals onto the design's interfaces using the correct interface protocols. The data exchange between the hardware and software is typically done through an API known as the Standard Co-Emulation Modeling Interface (SCE-MI), a standard sponsored by Accellera. This standard interface provides a mechanism to ensure portability of the environment forward to new emulation platforms, and portability of third-party verification IP into the environment.
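The packet-level abstraction described above can be illustrated with a hypothetical transactor that expands one whole-packet transaction into the per-cycle activity a BFM would drive onto the design's pins. All names here are invented for illustration and are not part of SCE-MI or any real verification kit:

```python
# Sketch of a transactor (bus functional model): one high-level "send
# packet" transaction is expanded into the per-cycle beats the design
# actually sees. The SOF/DATA/EOF framing is an illustrative assumption.

def packet_to_cycles(payload):
    """Expand a whole-packet transaction into per-cycle (phase, data) beats."""
    cycles = [("SOF", None)]                          # start-of-frame cycle
    cycles += [("DATA", byte) for byte in payload]    # one byte per cycle
    cycles.append(("EOF", None))                      # end-of-frame cycle
    return cycles

# One transaction call replaces len(payload) + 2 per-cycle testbench events.
beats = packet_to_cycles(b"\xde\xad\xbe\xef")
print(len(beats))  # 6 cycles for a 4-byte payload
```

The speed benefit comes from crossing the software/hardware boundary once per packet instead of once per clock cycle.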
The major benefits of transaction-based verification are ease of generating testbenches, reusability, and significantly improved execution speed. Zaiq Technologies, Inc. of Woburn, MA offers an end-to-end line of specialized design and verification products and services. The company's SYSTEMware product family provides designers with a pre-configured, rapidly usable, transaction-based system-level verification environment for complex designs, along with an extensive library of verification IP to comprehensively and efficiently test those designs. SYSTEMware supports the SCE-API standard. Zaiq has partnerships with both EVE and Aptix. The potential benefits of co-verification are compelling. The obstacle has been finding an affordable solution that provides acceptable performance on the software side. The figure above shows the relative performance of the various approaches to simulating hardware and software. The fundamental issue is that native software execution is orders of magnitude faster than unassisted hardware simulation. The base case, or unaccelerated method, for executing software on simulated hardware is to load the object code into an array declared in an HDL module and allow the models to execute on their own, fully simulating all the I/O activity. This translates into software execution speeds of 10 to 100 instructions per second, which is simply too slow to support software testing. Using Seamless with full optimization, software-intensive functions can achieve 10,000 to 75,000 instructions per second, depending upon the speed of the ISS. Simulation with mixed software and hardware cycles typically achieves a rate of 1,000 to 5,000 instructions per second. With hardware-assisted emulation, speeds in the MHz range can be achieved, which are sufficient for firmware and some application verification. A second important performance metric is compilation time for RTL simulation, which can be excessive. This is particularly of concern in the early stages, when changes are frequent.
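Back-of-envelope arithmetic makes these instruction-rate figures concrete. The 10-million-instruction boot sequence assumed here is hypothetical; the rates are taken from the ranges quoted above:

```python
# Rough wall-clock time to execute a hypothetical 10-million-instruction
# OS boot at the instruction rates discussed in the text.

BOOT_INSTRUCTIONS = 10_000_000  # illustrative assumption, not a measurement

rates = [("unassisted HDL simulation (~100 ips)", 100),
         ("mixed HW/SW cycles (~5,000 ips)", 5_000),
         ("optimized co-verification (~75,000 ips)", 75_000),
         ("hardware-assisted emulation (~1 MHz)", 1_000_000)]

for label, ips in rates:
    hours = BOOT_INSTRUCTIONS / ips / 3600
    print(f"{label}: {hours:.3f} hours")
```

At 100 instructions per second the hypothetical boot takes over a day, while hardware-assisted emulation finishes in seconds, which is why only the latter is practical for application-level software testing.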
As the cost/performance metric improves and as design complexity and software content rise, interest in software/hardware co-verification will increase dramatically.

Weekly Industry News Highlights

EVE Named ARM EDA Partner, RealView Model Library Access Program Member
Nassda's HSIM Adopted by Sirific Wireless for RF Design Verification
Fujitsu to Manufacture Leading-Edge FPGA Products for Lattice Semiconductor
VSIA Functional Verification Specification Released to Membership; Document Drives Reuse in Verification Environment
Highest Performing Floating-Point DSP from TI Offers 33 Percent Performance Increase
StarCore and VaST Systems Technology Collaborate on StarCore Processor Model That Speeds Wireless and Consumer Electronics Time to Market
Synopsys and Jungo Collaborate to Offer Complete USB Full Speed OTG Solution With IP and Software
Mentor Graphics Launches Analysis Tools for Design of Automotive and Aerospace Wiring Systems
Motorola Expands Use of Synopsys Phase-Shifting on 90-Nanometer Technology Node
Actel Expands MIL-STD-1553 Offering With New IP Core for Military, Space and Industrial Markets
Virage Logic Teams with Chartered and IBM to Provide Highly Differentiated IP Portfolio for Joint 90nm Manufacturing Process Platform
EDA Industry Reports 13% Revenue Growth in 4th Quarter

Copyright (c) 2004, Internet Business Systems, Inc. - 11208 Shelter Cove, Smithfield, VA 23420 - 888-44-WEB-44 - All rights reserved.